Template Archive File Specification

General Description

.p2dta (pdf2Data template archive), or .p2d, is a dedicated pdf2Data archive format that contains data and meta-information about templates, reference PDFs, and related items. It was developed to simplify the migration of items (including their versions) between separate installations of the Template manager application, and for use in the pdf2Data SDK.

The .p2dta and .p2d file extensions denote essentially the same format. When pdf2Data produces a file without a specified name (path), it uses the .p2dta extension for a full template archive (ready for the editor) and the .p2d extension for a minimized template archive (ready for the SDK). The extension is only a hint that helps the user determine the template status without reading the file content; pdf2Data does not use the file extension in processing.

A template archive file is a ZIP-compressed file with the following top-level structure:

/  
meta.json
templates/
reference_pdfs/

templates, reference_pdfs, and samples are folders that hold the content of the corresponding items. The meta.json file contains information about the items and the stored templates. Its format is specified in Meta of the Template Archive Specification.
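The layout described above can be inspected with any ZIP library. The following is a minimal sketch in Python, not part of pdf2Data itself: the function names, the placeholder file names, and the assumption that meta.json is UTF-8 encoded are all illustrative. The demo archive contains placeholder content only and does not reflect the real meta.json schema.

```python
import io
import json
import zipfile


def summarize_archive(source):
    """Return (top-level entry names, parsed meta.json) for a template archive.

    `source` may be a path or a binary file-like object. The file extension
    is not inspected, in line with the note that pdf2Data ignores it.
    """
    with zipfile.ZipFile(source) as archive:
        # Collect the first path component of every entry in the ZIP.
        top_level = sorted({name.split("/", 1)[0] for name in archive.namelist()})
        # meta.json sits at the archive root (UTF-8 encoding is an assumption).
        meta = json.loads(archive.read("meta.json").decode("utf-8"))
    return top_level, meta


def build_demo_archive():
    """Build a tiny in-memory archive with hypothetical placeholder content."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("meta.json", json.dumps({"placeholder": True}))
        archive.writestr("templates/example-template", b"placeholder")
        archive.writestr("reference_pdfs/example.pdf", b"%PDF-placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    entries, meta = summarize_archive(build_demo_archive())
    print(entries)
    print(meta)
```

Because the extension carries no meaning for processing, the same function would apply to both .p2dta and .p2d files.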

The templates directory contains all template files in all their versions. If the archive contains 3 different templates with 4 template file versions each, the templates directory holds 3 * 4 = 12 template files. The data from meta.json shall be used to determine the meaning of each file. The template file format is specified in Template Specification.

The reference_pdfs directory contains all PDF files that some template version uses to extract data from. For more information, see the sections below.

Common Definitions

JSON descriptor simple types:

  • version - a string in "X.Y.Z" or "X.Y.Z-S" format, where X, Y, and Z are numbers and S is an arbitrary string
  • date - a string in ISO-8601 date format
  • archive path - a string representing the path to a file within the current template archive, starting from the archive root. Path element names match the regexp "[a-zA-Z0-9\-\.]+"
  • uuid - a string in UUID format, i.e. 32 hexadecimal (base-16) digits displayed in five groups separated by hyphens, in the form 8-4-4-4-12, for a total of 36 characters (32 hexadecimal characters and 4 hyphens)
  • enum - a string that shall be equal to one of the predefined values
  • list(X) - a JSON list with elements of type X
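The simple types above can be checked mechanically. The sketch below is one possible reading of these definitions in Python, not an official validator. The function names are hypothetical, "/" is assumed as the path separator, the suffix S is assumed to be non-empty, and `datetime.fromisoformat` accepts only a subset of ISO-8601.

```python
import re
from datetime import datetime

# "X.Y.Z" or "X.Y.Z-S": X, Y, Z numeric; S is assumed to be non-empty.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+(-.+)?")
# Path element pattern quoted from the type definition above.
PATH_ELEMENT_RE = re.compile(r"[a-zA-Z0-9\-\.]+")
# 8-4-4-4-12 hexadecimal groups, 36 characters in total.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_version(value):
    return isinstance(value, str) and VERSION_RE.fullmatch(value) is not None


def is_date(value):
    # Assumption: the subset of ISO-8601 understood by datetime.fromisoformat.
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


def is_archive_path(value):
    # Assumption: elements are separated by "/" with no leading slash.
    if not isinstance(value, str):
        return False
    return all(PATH_ELEMENT_RE.fullmatch(part) for part in value.split("/"))


def is_uuid(value):
    return isinstance(value, str) and UUID_RE.fullmatch(value) is not None


def is_enum(value, allowed):
    return isinstance(value, str) and value in allowed


def is_list_of(value, check):
    # list(X): a JSON list whose elements all satisfy the check for X.
    return isinstance(value, list) and all(check(item) for item in value)
```

Usage might look like `is_list_of(["1.0.0", "2.0.0-rc1"], is_version)`, which composes the list(X) check with the check for its element type.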

Field required:

  • required - the field is expected to be present and not null
  • optional - the field may be absent or null
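The distinction between required and optional fields can be expressed as a small check. This is a sketch only; the function name and the field names in the usage note are hypothetical and are not taken from the meta.json specification.

```python
def missing_required(descriptor, required_fields):
    """Return the required field names that are absent or null in `descriptor`.

    Optional fields need no check: they may be missing or null.
    """
    # dict.get returns None both for a missing key and for an explicit null.
    return [name for name in required_fields if descriptor.get(name) is None]
```

For example, `missing_required({"id": "x", "name": None}, ["id", "name", "version"])` reports both the null field and the absent one.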